Longest Common Subsequence Algorithm

The Longest Common Subsequence (LCS) Algorithm is a powerful approach used in computer science, bioinformatics, and data analysis to find the longest sequence of characters or elements that two or more sequences have in common. This dynamic programming technique is particularly useful in comparing strings or sequences, and it plays a crucial role in various applications such as DNA sequence alignment, file comparison, and natural language processing. The LCS problem can be defined as follows: given two sequences X and Y, find the longest subsequence that is common to both X and Y without altering the order of elements in the original sequences.

The LCS algorithm essentially breaks the problem down into smaller subproblems and uses a tabular method to store the solutions of these subproblems in a bottom-up manner. It starts by constructing a matrix where the rows represent the elements of one sequence and the columns represent the elements of the other sequence. The algorithm then iterates through the matrix, comparing the corresponding elements of the two sequences. If the elements match, it adds 1 to the value in the diagonal cell above and to the left, indicating that the common subsequence has been extended by one character. If the elements do not match, the algorithm takes the maximum value from either the cell above or the cell to the left. The bottom-right cell of the matrix will then contain the length of the longest common subsequence, and the subsequence itself can be reconstructed by backtracking through the matrix, following the path of maximum values.

//Longest common subsequence - Dynamic Programming
#include <iostream>
using namespace std;

void Print(int trace[20][20], int m, int n, string a)
{
    if (m == 0 || n == 0)
    {
        return;
    }
    if (trace[m][n] == 1)
    {
        Print(trace, m - 1, n - 1, a);
        cout << a[m - 1];
    }
    else if (trace[m][n] == 2)
    {
        Print(trace, m - 1, n, a);
    }
    else if (trace[m][n] == 3)
    {
        Print(trace, m, n - 1, a);
    }
}

int lcs(string a, string b)
{
    int m = a.length(), n = b.length();
    int res[m + 1][n + 1];
    int trace[20][20];

    // fills up the arrays with zeros.
    for (int i = 0; i < m + 1; i++)
    {
        for (int j = 0; j < n + 1; j++)
        {
            res[i][j] = 0;
            trace[i][j] = 0;
        }
    }

    for (int i = 0; i < m + 1; ++i)
    {
        for (int j = 0; j < n + 1; ++j)
        {
            if (i == 0 || j == 0)
            {
                res[i][j] = 0;
                trace[i][j] = 0;
            }

            else if (a[i - 1] == b[j - 1])
            {
                res[i][j] = 1 + res[i - 1][j - 1];
                trace[i][j] = 1; // 1 means trace the matrix in upper left diagonal direction.
            }
            else
            {
                if (res[i - 1][j] > res[i][j - 1])
                {
                    res[i][j] = res[i - 1][j];
                    trace[i][j] = 2; // 2 means trace the matrix in upwards direction.
                }
                else
                {
                    res[i][j] = res[i][j - 1];
                    trace[i][j] = 3; //  means trace the matrix in left direction.
                }
            }
        }
    }
    Print(trace, m, n, a);
    return res[m][n];
}

int main()
{
    string a, b;
    cin >> a >> b;
    cout << lcs(a, b);
    return 0;
}

Longest Common Subsequence Algorithm

LANGUAGE:

DARK MODE:

PROGRAMMING LANGUAGES:

Python

Java

Javascript

C#

C++

C

Ruby

Scala

MATLAB

Kotlin

Rust

R

Go